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Quantum information processing, science of - The theoretical, experimental and 
technological areas covering the use of quantum mechanics for communication and computa- 
tion. 

Kluwer Encyclopedia of Mathematics, Supplement III 

Research of the last few decades has established that quantum information, or information based on 
quantum mechanics, has capabilities that exceed those of traditional "classical" information. For exam- 
ple, in communication, quantum information enables quantum cryptography [1, 2], which is a method 
for communicating in secret. Secrecy is guaranteed because eavesdropping attempts necessarily disturb 
the exchanged quantum information without revealing the content of the communication. In computation, 
quantum information enables efficient simulation of quantum physics [3], a task for which general pur- 
pose, efficient, classical algorithms are not known to exist. Quantum information also leads to efficient 
algorithms for factoring of large numbers [4, 5], which is believed to be difficult for classical computers. 
An efficient factoring algorithm would break the security of commonly used public key cryptographic 
codes used for authenticating and securing internet communications. A fourth application of quantum 
information improves the efficiency with which unstructured search problems can be solved [6] . Quantum 
unstructured search may make it possible to solve significantly larger instances of optimization problems 
such as the scheduling and traveling salesman problems. 

As a result of the capabilities of quantum information, the science of quantum information processing 
is now a prospering, interdisciplinary field focused on better understanding the possibilities and limitations 
of the underlying theory, on developing new applications of quantum information and on physically real- 
izing controllable quantum devices. The purpose of this primer is to provide an elementary introduction 
to quantum information processing (Sect. 2), and then to briefly explain how we hope to exploit the ad- 
vantages of quantum information (Sect. 3). These two sections can be read independently. For reference, 
we have included a glossary (Sect. 4) of the main terms of quantum information. 

When we use the word "information", we generally think of the things we can talk about, broadcast, 
write down, or otherwise record. Such records can exist in many forms, such as sound waves, electrical 
signals in a telephone wire, characters on paper, pit patterns on an optical disk, or magnetization on a 
computer hard disk. A crucial property of information is that it is "fungible": It can be represented in many 
different physical forms and easily converted from one form to another without changing its meaning. In 
this sense information exists independently of the devices used to represent it, but requires at least one 
physical representation to be useful. 

We call the familiar information stored in today's computers "classical" or "deterministic" to distin- 
guish it from quantum information. It is no accident that classical information is the basis of all human 
knowledge. Any information passing through our senses is best modeled by classical discrete or con- 
tinuous information. Therefore, when considering any other kind of information, we need to provide a 
method for extracting classically meaningful information. We begin by recalling the basic ideas of clas- 
sical information in a way that illustrates the general procedure for building an information processing 
theory. 
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1 Classical Information 



The basic unit of classical deterministic information is the "bit". A bit is an abstract entity or "system" 
that can be in one of the two states symbolized by o and 1. At this point, the symbols for the two states 
have no numeric meaning. That is why we have used a font different from that for the numbers and 1. 
By making a clear distinction between the bit and its states we emphasize that a bit should be physically 
realized as a system or device whose states correspond to the ideal bit's states. For example, if you are 
reading this primer on paper, the system used to realize a bit is a reserved location on the surface, and the 
state depends on the pattern of ink (o or l) in that location. In a computer, the device realizing a bit can be 
a combination of transistors and other integrated circuit elements with the state of the bit determined by 
the distribution of charge. 

In order to make use of information it must be possible to manipulate (or "process") the states of 
information units. The elementary operations that can be used for this purpose are called "gates". Two 
one-bit gates are the not and the reset gates. Applying the not gate to a bit has the effect of "flipping" 
the state of the bit. For example, if the initial state of the bit is o, then the state after applying not is 
not(o) = l. We can present the effect of the gate in the following form: 

Initial State Final State 

— > not(o) = l, (1) 

1 — > not(i) = o. 

The reset gate sets the state to o regardless of the input: 

Initial State Final State 

— > reset (o) = o, (2) 

1 — > reset(i) = o. 

By applying a combination of not and reset gates one can transform the state of a bit in every possible 
way. 

Information units can be combined to represent more information. Bits are typically combined into 
sequences. The states of such a sequence are symbolized by strings of state symbols for the constituent 
bits. For example a two-bit sequence can be in one of the following four states: oo, oi, 10 and n. The 
different bits are distinguished by their position in the sequence. 

The one-bit gates can be applied to any bit in a sequence. For example, the not gate applied to the 
second bit of a three-bit sequence in the state on changes the state to ooi. 

One-bit gates act independently on each bit. To compute with multiple bits, we need gates whose 
action can correlate the states of two or more bits. One such gate is the nand ("not and") gate, which acts 
on two bits in a bit sequence. Its effect is to set the state of the first bit to o if both the first and the second 
bit are l, otherwise it sets it to l. Here is what happens when nand is applied to two consecutive bits: 



Initial State Final State 

00 — > nand(oo) = 10, 

01 — > nand(oi) = n, (3) 
10 — > nand(io) = 10, 

n — > nand(n) = oi. 
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The nand gate can be applied to any two bits in a sequence. For example, it can be applied to the fourth 
and second bits (in this order) of four bits, in which case the initial state 1101 is transformed to 1100, 
setting the fourth bit to o. 

Other operations on bit sequences include adding a new bit to the beginning (prepend) or end 
(append) of a bit sequence. The new bit is always initialized to o. It is also possible to discard the 
first or last bit, regardless of its state. Versions of these operations that are conditional on the state of 
another bit may also be used. An example is the conditional append operation: "if the fc'th bit is in the 
state o then append a bit." 

The gates just introduced suffice for implementing arbitrary state transformations of a given bit se- 
quence. Instructions for applying gates in a particular order are called a "circuit". An important part of 
investigations in information processing is to determine the minimum resources required to perform infor- 
mation processing tasks. For a given circuit, the two primary resources are the number of gates and the 
total number of bits used. The "circuit complexity" of a desired transformation is the minimum number of 
gates needed to implement it. 

The model of computation defined by the ability to apply gates in a fixed sequence is called the "circuit 
model". Classical computation extends the circuit model by providing a means for repeating blocks of 
instructions indefinitely or until a desired condition is achieved. In principle, it is possible to conceive 
of a general purpose computer as a device that repeatedly applies the same circuit to the beginnings of 
several bit sequences. In this introduction, we take for granted a traditional programmable computer 
based on classical information. Thus a "quantum algorithm" is a program written for such a computer 
with additional instructions for applying gates to quantum information. The computational power of this 
model is equivalent to that of other general purpose models of quantum computation, such as quantum 
Turing machines [7]. 

For an introduction to algorithms and their analysis, see [8]. A useful textbook on computational 
complexity with an introduction to classical computation and computational machine models is [9]. 

2 Quantum Information 

The foundations of an information processing theory can be constructed by the procedure we followed in 
the previous section: 

1 . Define the basic unit of information. 

2. Give the means for processing one unit. 

3. Describe how multiple units can be combined. 

4. Give the means for processing multiple units. 

5. Show how to convert the content of any of the extant units to classical information. 

Note that the last step was not required for classical information processing. 

In this section, we follow the general procedure for defining an information processing theory to in- 
troduce quantum information processing. A simple example that exhibits the advantages of quantum 
information is given in Sect. 2.8. A version of the quantum factoring algorithm is described in Sect. 2.10. 
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2.1 The Quantum Bit 



The fundamental resource and basic unit of quantum information is the quantum bit (qubit), which behaves 
like a classical bit enhanced by the superposition principle (see below). From a physical point of view, 
a qubit is represented by an ideal two-state quantum system. Examples of such systems include photons 
(vertical and horizontal polarization), electrons and other spin-| systems (spin up and down), and systems 
defined by two energy levels of atoms or ions. From the beginning the two-state system played a central 
role in studies of quantum mechanics. It is the most simple quantum system, and in principle all other 
quantum systems can be modeled in the state space of collections of qubits. 

From the information processing point of view, a qubit's state space contains the two "logical", or 
"computational", states |o) and |i). The so-called "ket" notation for these states was introduced by 
P. Dirac, and its variations are widely used in quantum physics. One can think of the pair of symbols 
"|" and ")" as representing the qubit system. Their content specifies a state for the system. In this context 
o and 1 are system-independent state labels. When, say, o is placed within the ket, the resulting expression 
|o) represents the corresponding state of a specific qubit. 

The initial state of a qubit is always one of the logical states. Using operations to be introduced later, 
we can obtain states which are "superpositions" of the logical states. Superpositions can be expressed 
as sums a\o) + (3\i) over the logical states with complex coefficients. The complex numbers a and (3 
are called the "amplitudes" of the superposition. The existence of such superpositions of distinguishable 
states of quantum systems is one of the basic tenets of quantum theory called the "superposition principle". 
Another way of writing a general superposition is as a vector 

a\o)+P\i)~ fy, (4) 

where the two-sided arrow 'W is used to denote the correspondence between expressions that mean the 
same thing. 

The qubit states that are superpositions of the logical states are called "pure" states: A superposition 
a\o) + (3\i) is a pure state if the corresponding vector has length 1, that is \a\ 2 + \{3\ 2 = 1. Such a 
superposition or vector is said to be "normalized". (For a complex number given by 7 = x + iy, one can 
evaluate I7I 2 = x 2 + y 2 . Here, x and y are the real and imaginary part of 7, and the symbol i is a square 
root of —1, that is, z 2 = —1. The conjugate of 7 is 7 = x — iy. Thus I7I 2 = 77.) Here are a few examples 
of states given in both the ket and the vector notation: 




The state ^3) is obtained from |^ 2 ) by multiplication with i. It turns out that two states cannot be 
distinguished if one of them is obtained by multiplying the other by a "phase" e ld . Note how we have 
generalized the ket notation by introducing expressions such as \ip) for arbitrary states. 
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The superposition principle for quantum information means that we can have states that are sums of 
logical states with complex coefficients. There is another, more familiar type of information whose states 
are combinations of logical states. The basic unit of this type of information is the probabilistic bit (pbit). 
Intuitively, a pbit can be thought of as representing the as-yet-undetermined outcome of a coin flip. Since 
we need the idea of probability to understand how quantum information converts to classical information, 
we briefly introduce pbits. 

A pbit's state space is a probability distribution over the states of a bit. One very explicit way to 
symbolize such a state is by using the expression {p:o, (1— which means that the pbit has probability 
p of being o and 1 — p of being 1. Thus a state of a pbit is a "probabilistic" combination of the two 
logical states, where the coefficients are nonnegative real numbers summing to 1. A typical example is the 
unbiased coin in the process of being flipped. If "tail" and "head" represent o and l, respectively, the coin's 
state is {|:o, After the outcome of the flip is known, the state "collapses" to one of the logical states 
o and l. In this way, a pbit is converted to a classical bit. If the pbit is probabilistically correlated with 
other pbits, the collapse associated with learning the pbit's logical state changes the overall probability 
distribution by a process called "conditioning on the outcome". 

A consequence of the conditioning process is that we never actually "see" a probability distribution. 
We only see classical deterministic bit states. According to the frequency interpretation of probabilities, 
the original probability distribution can only be inferred after one looks at many independent pbits in the 
same state {p:o, (1 — p):i}: In the limit of infinitely many pbits, p is given by the fraction of pbits seen 
to be in the state o. As we will explain, we can never "see" a general qubit state either. For qubits there 
is a process analogous to conditioning. This process is called "measurement" and converts qubit states to 
classical information. 

Information processing with pbits has many advantages over deterministic information processing with 
bits. One advantage is that algorithms are often much easier to design and analyze if they are probabilistic. 
Examples include many optimization and physics simulation algorithms. In some cases, the best avail- 
able probabilistic algorithm is more efficient than any known deterministic algorithm. An example is an 
algorithm for determining whether a number is prime or not. It is not known whether every probabilistic 
algorithm can be "derandomized" efficiently. There are important communication problems that can be 
solved probabilistically but not deterministically. For a survey, see [10]. 

What is the difference between bits, pbits and qubits? One way to visualize the difference and see the 
enrichment provided by pbits and qubits is shown in Fig. 1. 
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FIG. 1: Visual comparison of the state spaces of different information units. The states of a bit correspond 
to two points. The states of a pbit can be thought of as "convex" combinations of the states of a bit 
and therefore can be visualized as lying on the line connecting the two bit states. A qubit's pure states 
correspond to the surface of the unit sphere in three dimensions, where the logical states correspond to 
the poles. This representation of qubit states is called the "Bloch sphere". The explicit correspondence 
is discussed at the end of Sect. 2.7. See also the definition and use of the Bloch sphere in [11]. The 
correspondence between the pure states and the sphere is physically motivated and comes from a way of 
viewing a spin-| system as a small quantum magnet. Intuitively, a state is determined by the direction of 
the north pole of the magnet. 

2.2 Processing One Qubit 

The quantum version of the not gate for bits exchanges the two logical states. That is, using ket notation, 

not(a|o) + /3|i>) = a\i) + p\o) = p\o) + a\i). (8) 



notl )■ (9) 



In vector notation this equation becomes 



(3 1 \a 

Another way of expressing the effect of not is by multiplying the vector by a matrix representing not: 

-(?)-(! (9- 

so we that can identify the action of not with the matrix a x — ( ^ ) . An even simpler gate is the one 



x 1 , 

that does nothing. We call this the noop gate, and its matrix form is the identity matrix as shown in the 
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noop|«W 1 ?H2 ) = [?)■ (11) 



following equation: 

The noop and not gates are "reversible". In other words, we can undo their actions by applying other 
gates. For example, the action of the not gate can be undone by another not gate. The action of every 
reversible quantum gate can be represented by matrix multiplication, where the matrix has the additional 
property of preserving the length of vectors. Such matrices are called "unitary" and are characterized by 
the equation A^A = 1, where A^ is the conjugate transpose of A and 11 is the identity matrix. (The conju- 
gate transpose of a matrix is computed by flipping the matrix across the main diagonal and conjugating the 
complex numbers.) For gates represented by a matrix, the unitarity condition is necessary and sufficient 
for ensuring that pure states get mapped to pure states. 

Because qubit states can be represented as points on a sphere, reversible one-qubit gates can be thought 
of as rotations of the Bloch sphere. This is why such quantum gates are often called "rotations". As 
explained in detail in [11], rotations around the x, y and z axis are in a sense generated by the three Pauli 
matrices 

*' = ( i o ) ' "*= [ i "o j ' **= ( o -°i ' ' (12) 



each of which represents a one-qubit gate. For example, a rotation around the x-axis by an angle <p is 
given by e~ l<Tx ^ 2 = cos(0/2)l — i sm((p/2)a x . To obtain this identity, one can use the power series for 
e A , e A = Ylk^o an d exploit the fact that a 2 = 1 to simplify the expression. Here are some gates that 
can be defined with the help of rotations: 



90° x -rotation : rotx 



90° 



90° ^/-rotation : roty 90o = -4= 



z -rotation : rotz rf 





















-i<t>/2 












—2 
1 



(13) 





Ad>/2 



Hadamard gate : H = ( j ^ J 



The rotation gates often show up in controlling spins or ions with radio-frequency pulses or lasers. The 
Hadamard gate is used primarily by quantum programmers. It can be expressed as a product of a 90° 
y-rotation and a z . 

To check directly that the rotation gates are reversible one can determine their inverses. In this case 
and as expected, the inverse of a rotation is the rotation around the same axis in the opposite direction. For 
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example, the inverses of the roty 90 o and rotz^ gates are given by 



roty_ 



i M 1 



V2 \ -1 1 

(14) 



rotz_ 



e 1 *' 2 







-id>/2 



Another useful property of the rotation gates is that the angles add when rotations are applied around the 
same axis. For example, rotz^rotzg = rotz^+g. 

The ket notation can be extended so that we can write gates in a compact form that readily generalizes 
to multiple qubits. To do so we have to introduce expressions such as (^| = a(o| This is called the 

"bra" notation. The terminology comes from the term "bracket": The 'bra" is the left and the "ket" is the 
right part of a matched pair of brackets. From the vector point of view, (?p\ corresponds to the row vector 
(a, (3). Note that a column vector multiplied by a row vector yields a matrix. In the bra-ket notation, this 
corresponds to multiplying a ket \ip) by a bra written as Since this represents an operator on 

states, we expect to be able to compute the effect of on a state Iv 9 ) by forming the product. To be 

able to evaluate such products with one-qubit kets and bras, we need the following two rules. 



Distributivity. You can rewrite sums and products using distributivity. For example, 

(|<o| + ^<i|)i|i) = f <o||i) + |<i||i>. (15) 

Observe that we can combine the amplitudes of terms, but we cannot rearrange the order of the bras 
and kets in a product. 



Inner product evaluation. The product of a logical "bra" and a logical "ket" is evaluated according 
to the identities 

<o||o> = 1, 
(o||i> = 0, 
(i||o> = 0, 

<i||i> = 1. (16) 

It follows that for logical states, if a bra multiplies a ket, the result cancels unless the states match, 
in which case the answer is 1. Applying inner product evaluation to the example (Eq. 15) results in 

i3 , ,, , iA , , , , i3 i4 i4 

T {.||i> + T (i||i) = T + T l = T . (17) 

To simplify the notation, we can omit one of the two vertical bars in products such as (a\\b) and write 
(a\b). 

To understand inner product evaluation, think of the expressions as products of row and column vec- 
tors. For example, 
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That is, as vectors the two states |o) and |i) are orthogonal. In general, if \(f>) and \^p) are states, then 
(0|^>) is the "inner product" or "overlap" of the two states. In the expression for the overlap, (0| is 
computed from |0) = a\o) + P\i) by conjugating the coefficients and converting the logical kets to bras: 
(0| = a{o\ + j3{x\. In the vector representation, this is the conjugate transpose of the column vector for 
|0), so the inner product agrees with the usual one. Two states are orthogonal if their overlap is zero. We 
write 10)+ = (0| and (01+ = |0). 

Every linear operator on states can be expressed with the bra-ket notation. For example, the bra-ket 
expression for the noop gate is noop = |o) (o| + |i) (i|. To apply noop to a qubit, you multiply its state 
on the left by the bra-ket expression: 

noop(a|o> +/ 3|i>) = (|o)(o| + |i)(i|) (a|o> + /3|l> 



= |o>(o|^|o>+/3|i>j + |i>(i|^|o>+/3|i>^ 

= a|o)(o|o> +/?|o>(o|i> +a|i)(i|o) +/3|i)(i|i> 
= a|o)l +/3|o)0 + a|i)0 + /3|i)l 

= «|o)+/3|i> (19) 

One way to think about an operator such as \a)(b\ is to notice that when it is used to operate on a ket 
expression, the (b\ picks out the matching kets in the state, which are then changed to \a). For example, 
we can write the not operation as not = |o) (i| + |i) (o|. 

The coefficients of the \a) (b\ in a bra-ket representation of a gate correspond to matrix entries in the 
matrix representation. The relationship is defined by 

«oo|o>(o| + « i|o)(i| + a 10 |i)(o| + a n |i)(i| - ( " 00 " 01 ) . (20) 



2.3 Two Quantum Bits 

Some states of two quantum bits can be symbolized by the juxtaposition (or multiplication) of states of 
each quantum bit. In particular, the four logical states |o)|o), |o)|i), |i)|o), and |i)|i) are acceptable 
pure states for two quantum bits. In these expressions, we have distinguished the qubits by position (first 
or second). It is easier to manipulate state expressions if we explicitly name the qubits, say A and B. We 
can then distinguish the kets by writing, for example, \ip) A for a state of qubit A. Now the state |o) |i) can 
be written with explicit qubit names (or "labels") as 

|o) |i> = |i> |o> = |oi) = |io) . (21) 

1 'A 1 'B 1 'B 1 'A 1 'AB 1 'BA v ' 

Having explicit labels allows us to unambiguously reorder the states in a product of states belonging to 
different qubits. We say that kets for different qubits "commute". 

So far we have seen four states of two qubits, which are the logical states that correspond to the states 
of two bits. As in the case of one qubit, the superposition principle can be used to get all the other pure 
states. Each state of two qubits is therefore of the form 

a|oo) AR + /3|oiL+ 7 |ioL + 5|nL, (22) 
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where a, /3, 7, and 5 are complex numbers. Again, there is a column vector form for the state: 



P 



(23) 



7 



V * / 



and this vector has to be of unit length, that is |a| 2 + |/3| 2 + |7| 2 + |<5| 2 = 1. When using the vector form 
for qubit states, one has to be careful about the convention used for ordering the coefficients. 
Other examples of two-qubit states in ket notation are the following: 



The first two of these states have the special property that they can be written as a product |0i) A |02) B 
of a state of qubit A and a state of qubit B. The second expression for \ip 2 ) shows that the product 
decomposition is not always easy to see. Such states are called "product" states. The last two states, 1^3) 
and l^i) are two of the famous Bell states. They have no such representation as a product of independent 
states of each qubit. They are said to be "entangled" because they contain a uniquely quantum correlation 
between the two qubits. Pbits can also have correlations that cannot be decomposed into product states, 
but the entangled states have additional properties that make them very useful. For example, if Alice and 
Bob each have one of the qubits that together are in the state 1^3) , they can use them to create a secret 
bit for encrypting their digital communications. 

2.4 Processing Two Qubits 

The simplest way of modifying the state of two qubits is to apply one of the one-qubit gates. If the gates 
are expressed in the bra-ket notation, all we need to do is add qubit labels so that we know which qubit 
each bra or ket belongs to. For example, the not gate for qubit B is written as 



The labels for bra expressions occur as left superscripts. To apply expressions like this to states, we need 
one more rule: 




(24) 



not( B ) = |o) B B (i| + |i) B B (o|. 



(25) 



11 



Commutation. Kets and bras with different labels can be interchanged in products (they "com- 
mute"). This is demonstrated by the following example: 



(\o)y\) ioi> A 



i»> A i^i n 

|o)Jo> = |oo) ar . 



(26) 



Note that we cannot merge the two vertical bars in expressions such as b (i||o) a because the two terms 
belong to different qubits. The bars can only be merged when the expression is an inner product, which 
requires that the two terms belong to the same qubit. 

With the rules for bra-ket expressions in hand, we can apply the not gate to one of our Bell states to 
see how it acts: 



not (B) ^=(|oo) + In) 

lei \ 1 'AB 1 'A 



^(|o> b b (i|(|oo> ab + |ii) ab ) + |i»|(|oo> ab + |ii) a 



^0 



o> r b (iiiool + |o>:(i||h> a 



^(|o>Jo> b b (i||o> b + |i> a |o) b b (i||i> b + |o) a |i»||o) e 
^(|o> a |o> b + |i> a |o> b 1 + |o> a |i> b 1 + |i> a |i> b 

ifWJol + ioXiiiU^Oo^L + M 



-ll»IK 

14 K( 



x/2V ' A ' 



V2V 



(27) 



not( B ) 



(28) 



The effect of the gate was to flip the state symbols for qubit B, which results in another Bell state. 

The gate not^ B ^ can also be written as a 4 x 4 matrix acting on the vector representation of a two-qubit 
state. However, the relationship between this matrix and the one-qubit matrix is not as obvious as for the 
bra-ket expression. The matrix is 

/ 1 \ 
10 
1 
\ 1 / 

which swaps the top two and the bottom two entries of a state vector. 

One way to see the relationship between the one and the two-qubit representation of the gate not^ B ^ is 
to notice that because the noop gate acts as the identity, and because we can act on different qubits inde- 
pendently, noop^not^ ~ not( B \ The matrix for not^ 8 ^ can be expressed as a "Kronecker product" 
("<8>") of the matrices for noop and not: 



noop( A W( B ) 



1 
1 



1 

1 



12 



1 

1 



o( 1 
V V 1 

/ 1 \ 

10 

1 

\ 1 J 



1 

1 

1 

1 



(29) 



The Kronecker product of two matrices expands the first matrix by multiplying each entry by the second 
matrix. A disadvantage of the matrix representation of quantum gates is that it depends on the number and 
order of the qubits. However, it is often easier to visualize what the operation does by writing down the 
corresponding matrix. 

One cannot do much with one-bit classical gates. Similarly, the utility of one-qubit gates is limited. 
In particular, it is not possible to obtain a Bell state starting from |oo) ab or any other product state. We 
therefore need to introduce at least one two-qubit gate not expressible as the product of two one-qubit 
gates. The best known such gate is the "controlled-not" (cnot) gate. Its action can be described by 
the statement, "if the first bit is l, flip the second bit, otherwise do nothing". The bra-ket and matrix 
representations for this action are 



cnot( AB ) 



o> A A (o| + |i> A A (i 

/ 1 \ 
10 
1 

\ o o i o / 



o>!(i| + 



(30) 



The cnot gate is reversible because its action is undone if a second cnot is applied. This outcome is 
easy to see by computing the square of the matrix for cnot, which yields the identity matrix. As an 
exercise in manipulating bras and kets, let us calculate the product of two cnot gates by using the bra-ket 
representation. 

cn t( AB )cnot( AB ) = f|o>;(o| + |im|f|om| + |i>>|)') f|o>;(o| + |i);(i|f|o> R B (i| + |i)>0Y od 



The first step is to expand this expression by multiplying out. Expressions such as |o) A (o| |i) A (i| cancel 



because of the inner product evaluation rule, a (o|i) a 
labels and rewrite |o) A (o| |o) A (o| = |o) A (o| to get 



0. One can also reorder bras and kets with different 



cnot( AB >cnot( AB ) 



|o»| + |i> A A (i|(|o) B B (i| 
|o> A A (o| + |i> A A (i|(|o»| 
|o»| + |i) A A (i|noop( B ) 



1*1 



o>!(i| + 
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* |o»| + |i) A A (i| 
= noop^ 

- 1, (32) 

where we used the fact that when the bra-ket expression for noop is applied to the ket expression for a 
state it acts the same as (here denoted by the symbol "~") multiplication by the number 1. 



2.5 Using Many Quantum Bits 

To use more than two, say five, qubits, we can just start with the state |o) a |o) b |o) c |o) d |o) e and apply 
gates to any one or two of these qubits. For example, cnot^ DB ' applies the cnot operation from qubit 
D to qubit B. Note that the order of D and B in the label for the cnot operation matters. In the bra- 
ket notation, we simply multiply the state with the bra-ket form of cnot^ DB ^ from the left. One can 
express everything in terms of matrices and vectors, but now the vectors have length 2 5 = 32, and the 
Kronecker product expression for cnot^ 08 ) requires some reordering to enable inserting the operation so 
as to act on the intended qubits. Nevertheless, to analyze the properties of all reversible (that is, unitary) 
operations on these qubits, it is helpful to think of the matrices, because a lot of useful properties about 
unitary matrices are known. One important result from this analysis is that every matrix that represents 
a reversible operation on quantum states can be expressed as a product of the one- and two-qubit gates 
introduced so far. We say that this set of gates is "universal". 

For general purpose computation, it is necessary to have access to arbitrarily many qubits. Instead of 
assuming that there are infinitely many from the start, it is convenient to have an operation to add a new 
qubit, namely, add. To add a new qubit labeled X in the state |o) x , apply add^ to the current state. This 
operation can only be used if there is not already a qubit labeled X. In the bra-ket notation, we implement 
the add^ operation by multiplying the ket expression for the current state by |o) . 



2.6 Qubit Measurements 

In order to classically access information about the state of qubits we use the measurement operation 
meas. This is an intrinsically probabilistic process that can be applied to any extant qubit. For informa- 
tion processing, one can think of meas as a subroutine or function that returns either o or l as output. 
The output is called the "measurement outcome". The probabilities of the measurement outcomes are 
determined by the current state. The state of the qubit being measured is "collapsed" to the logical state 
corresponding to the outcome. Suppose we have just one qubit, currently in the state \vp) = a\o) + f3\i). 
Measurement of this qubit has the effect 

(o:|o) with probability \a\ 2 , 
(33) 
i:|i) with probability |/3| 2 . 

The classical output is given before the new state for each possible outcome. This measurement behavior 
explains why the amplitudes have to define unit length vectors: Up to a phase, they are associated with 
square roots of probabilities. 
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For two qubits the process is more involved. Because of possible correlations between the two qubits, 
the measurement affects the state of the other one too, similar to conditioning for pbits after one "looks" 
at one of them. As an example, consider the state 

U) =-|oi) +-I10) +-I00) . (34) 

1 r 'AB 3 1 'AB 3 1 'AB 3 1 'AB V ' 

To figure out what happens when we measure qubit A, we first rewrite the current state in the form 
a \°) I0o ) + /^l 1 ) |0i) > where |0 O ) and |0i) are pure states for qubit B. It is always possible to do 

A B A B B B 

that. For the example of Eq. 34: 




so a = ^p, /3 — y, \<j>o\ = 7gl°) B + "^l 1 ) B an( ^ l0i) B = l°) B - ^ ne ^ ast ste P rec l u i re d pulling out the factor 
of ^ to make sure that \(f>o) B is properly normalized for a pure state. Now that we have rewritten the state, 
the effect of measuring qubit A can be given as follows: 

C o:|o)J0 o ) b with probability \a\ 2 , 
meas( A )(a|o) A |0 o > B +/?|i»i> B ) = { (36) 

[ i:|i)J0i) B with probability |/3| 2 . 

For the example, the measurement outcome is with probability |, in which case the state collapses to 

(75 l°^ B 75 1 1 ^) ' ^ ne outcome i s 1 with probability |, in which case the state collapses to |i) a |o) b . 
The probabilities add up to 1 as they should. 

The same procedure works for figuring out the effect of measuring one of any number of qubits. Say 
we want to measure qubit B among qubits A, B, C, D, currently in state |^) ABCD - Fi rst rewrite the state in 
the form a l°) B l0o) ACD + /^I-OJ^i^cd' making sure that the ACD superpositions are pure states. Then the 
outcome of the measurement is with probability \a\ 2 and 1 with probability |/3| 2 . The collapsed states 
^ l°> B l0o> AfD and |i> B |0i> ACD , respectively. 

Probabilities of the measurement outcomes and the new states can be calculated systematically. For 
example, to compute the probability and state for outcome of meas^' given the state |^>) , one can first 
obtain the unnormalized ket expression \(f>' ) = A (°ll^) by using the rules for multiplying kets by bras. 
The probability is given by p = B (0 o l0o) > and the collapsed, properly normalized pure state is 

|o> A |0o> B /V^ = KHK'Uv^ (37) 

The operator P = |o) a (o| is called a "projection operator" or "projector" for short. If we perform the same 
computation for the outcome 1, we find the projector P x = |i) A (i|. The two operators satisfy P a 2 = P a , 
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Pj = P a and P + Pi = 1. In terms of the projectors, the measurement's effect can be written as follows: 

{ 0:P o\ip) AB / y/Po with probability p , 
(38) 
1:P M AB /y/Pi with probability pi, 

where p = AB (4-'\P \i ) \ B an d p\ = AB (4-'\Pi\4>\ B - I n quantum mechanics, any pair of projectors satisfying 
the properties given above is associated with a potential measurement whose effect can be written in the 
same form. This is called a binary "von Neumann", or "projective", measurement. 

2.7 Mixtures and Density Operators 

The measurement operation "reads out" information from qubits to pbits. What if we discard the pbit that 
contains the measurement outcome? The result is that the qubits are in a probabilistic "mixture" of two 
pure states. Such mixtures are a generalization of pure states. The obvious way to think about a mixture 
is that we have a probability distribution over pure quantum states. For example, after discarding the pbit 
and qubit A in Eq. 36, we can write the state of B as p = {|«| 2 :|0o) B > |/^| 2: l0i) B }> using the notation for 
probability distributions introduced earlier. 

Mixtures frequenty form when using irreversible operations such as measurement. Except for mea- 
surement, the quantum gates that we have introduced so far are reversible and therefore transform pure 
states to pure states, so that no mixtures can be formed. One of the fundamental results of reversible classi- 
cal and quantum computation is that there is no loss in power in using only reversible gates. Specifically, it 
is possible to change a computation that includes irreversible operations to one that accomplishes the same 
goal, has only reversible operations and is efficient in the sense that it uses at most polynomial additional 
resources. However, the cost of using only reversible operations is not negligible. In particular, for ease 
of programming, and more importantly, when performing repetitive error-correction tasks (see [12]), the 
inability to discard or reset qubits can be very inconvenient. We therefore introduce additional operations 
that enable resetting and discarding. 

Although resetting has a so-called "thermodynamic" cost (think of the heat generated by a computer), it 
is actually a simple operation. The reset operation applied to qubit A can be thought of as the result of first 
measuring A, then flipping A if the measurement outcome is |i), and finally discarding the measurement 
result. Using the notation of Eq. 36, the effect on a pure state |^) is given by: 

reset^l^ = {M 2 :|o»> B , l/3| 2 :|o»> B }. (39) 

To apply reset to an arbitrary probability distribution, you apply it to each of the distribution's pure 
states and combine the results to form an expanded probability distribution. The discard^ operation 
is reset ( A ' followed by discarding qubit A. Therefore, in the expression for the state after reset ( A ), all 
the |o) a are removed. It is an important fact that every physically realizable quantum operation, whether 
reversible or not, can be expressed as a combination of add operations, gates from the universal set and 
discard operations. 

The representation of mixtures using probability distributions over pure states is redundant. That is, 
there are many probability distributions that are physically indistinguishable. A non-redundant description 
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of a quantum state can be obtained by using "density operators". The density operator for the mixture p 
given in Eq. 39 is given by 

P=l«| 2 |0o> B B (0o| + |/5| 2 |0i> B B <0i|. (40) 

The general rule for calculating the density operator from a probability distribution is as follows: For 
each pure state |0) in the distribution, calculate the operators and sum them weighted by their 

probabilities. 

There is a way to apply gates to the density operators defined by states. If the gate acts by the unitary 
operator U, then the effect of applying it to p is given by UpW, where W is the conjugate transpose of 
U. (In the bra-ket expression for U, W is obtained by replacing all complex numbers by their conjugates, 
and terms such as |0)(y?|by 

The relationship between a qubit's state space and a sphere can be explained in terms of the qubit's 
density operators. In matrix form, this operator is a 2 x 2 matrix, which can be written uniquely as a sum 
(1 + xa x + ya y + za z )/2. One can check that if the density operator |V ; )(V ; I f° r a qubit's pure state is 
written as such a sum, 

\t/j){ip\ = (l + xcr x + ycry + Z(T z )/2, (41) 

then the vector (x, y, z) thus obtained is on the surface of the unit sphere in three dimensions. In fact, for 
every vector (x, y, z) on the unit sphere, there is a unique pure state satisfying Eq. 41. Since the density 
operators for mixtures are arbitrary, convex (that is probabilistic) sums of pure states, the set of (x, y, z) 
thus obtained for mixtures fills out the unit ball. The rotations introduced earlier modify the vector (x, y, z) 
in the expected way, by rotation of the vector around the appropriate axis. See [11] for more details. 

2.8 Quantum Computation 

The model of computation defined by the one- and two-qubit gates and the operations of adding (add), 
measuring (meas) and discarding (discard) qubits is called the "quantum network model". A sequence 
of instructions for applying these operations is called a "quantum network". Quantum computation extends 
the network model by providing a means for repeating blocks of instructions. Such means can be specified 
by a formal machine model of computation. There are several such models of classical and quantum 
computers. One of the best known is the Turing machine, which has a quantum analogue, the quantum 
Turing machine. This model has its uses for formal studies of computation and complexity, but is difficult 
to program. Fortunately, as mentioned in Sect. 1, there is no loss of computational power if the means 
for repeating instructions is provided by a classical computer that can apply gates and other operations to 
qubits. A general quantum algorithm is a program written for such a computer. 

There are three practical methods that can be used to write quantum networks and algorithms. The 
first is to use the names for the different operations and algebraically multiply them. The second is to draw 
quantum networks, which are pictorial representations of the sequence of steps in time, somewhat like 
flowcharts without loops. The third is to use a generic programming language enhanced with statements 
for accessing and modifying quantum bits. The first two methods work well as long as the sequence is 
short and we do not use many operations that depend on measurement outcomes or require loops. They 
are often used to describe subroutines of longer algorithms presented either in words or by use of the third 
method. 
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To see how to use the different methods and also to illustrate the power of quantum computation, we 
work out a short quantum algorithm that solves the following problem: 

The Parity Problem: Given is a "black box" quantum operation BB^ ABC ^ that has the following effect 
when applied to a logical basis state: 

BB( ABC )|aAa B > AB |ac> c = |a A aB> AB |ac © (b A a A © 6 B a B )> c , (42) 

where 6 a and 6b are or 1. The actual values of 6a and 6b are unknown. The problem is to determine 
what 6a and 6b are by using the black box only once. 

The terminology and the definition of the operation BB^ 60 ** require explanation. In computation, we 
say that an operation is a black box or an "oracle" if we have no access whatsoever to how the operation 
is implemented. In a black box problem, we are promised that the black box implements an operation 
from a specified set of operations. In the case of the parity problem, we know that the operation is 
to add one of four possible parities (see below). The problem is to determine which parity is added 
by using the black box in a network. Black box problems serve many purposes. One is to study the 
differences between models of computation, just as we are about to do. In fact, black box problems 
played a crucial role in the development of quantum algorithms by providing the first and most convincing 
examples of the power of quantum computers [13, 14]. Some of these examples involve generalizations of 
the parity problem. Another purpose of black box problems is to enable us to focus on what can be learned 
from the "input/output" behavior of an operation without having to analyze its implementation. This is 
useful because in many cases of interest, it is very difficult to exploit knowledge of the implementation 
to determine a desirable property of the operation. A classical example is the well-known satisfiability 
problem, in which we are given a classical circuit with one output bit and we need to determine whether 
there is an input for which the output is l. Instead of trying to analyze the circuit, one can look for and use 
a general purpose black-box search algorithm to find the "satisfying" input. 

In the definition of the effect of BB^ ABC \ the operation "©" is addition modulo 2, so 1 © 1 = 0, and 
all the other sums are as expected. As the state symbols now have a numeric meaning, we use the number 
font for states. To see what BB does, suppose that 6a and 6 B are both 1. Then BB adds (modulo 2) the 
parity of the logical state in AB to the logical state of C. The parity of a logical state is if the number of 
l's is even and 1 if it is odd. The action of BB for this example is given by: 

BB( abc >|00> ab |0) c = |00> ab |0) c 
BB( ABC )|01) A J0) C = |01> ab |0©1) c 

BB( abc >|10) ab |1) c = |10) ab |1©1) c 

= |io>Jo>c 

BB( abc >|11) ab |0> c = |11> ab |0) c (43) 

The action of the black box is extended to superpositions by "linear extension". This means that to apply 
BB to a superposition of the logical states, simply apply it to each logical summand and add the results. 
Different values of 6a and 6b correspond to different parities. For example, if 6a = 1 and 6b = 0, then the 
parity of the state in A is added to the state in C. In this sense, what is added is the parity of a subset of 
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the two qubits AB. Thus, one way of thinking about the problem is that we wish to find out which subset's 
parity the black box is using. 

We can give an algorithm that solves the parity problem using each of the three methods for describing 
quantum networks. Here is an algebraic description of a solution, qparity( ABC \ given as a product of 
quantum gates that involves one use of the black box. We defer the explanation of why this solution works 
until after we show how to represent the algorithm pictorially using quantum networks. 

qparit y ( ABC ) = meas( B )H( B Veas( A )H^BB( ABC )H( c )not( c )add( c )H( B )add( B )H( A )add( A ). 

(44) 

The output of the algorithm is given by the classical outputs of the measurements of qubit A, which yields 
6a, and qubit B, which yields 6 B . As is conventional, in writing products of linear operators, the order of 
application in Eq. 44 is right to left, as in a product of matrices applied to a column vector. This order 
of terms in a product is, however, counterintuitive, particularly for operations to be performed one after 
the other. It is therefore convenient to use left to right notation, as is done in describing laser or radio- 
frequency pulse sequences. One way to make it clear that left to right order is used involves putting dots 
between gates as in the following version of Eq. 44: 

qparity( ABC )=add( A ).H( A ).add( B ).H( B ).add( c Uot( c ).H( c ).BB( ABC ).H( A ).meas( A ).H( B \meas( B ). 

(45) 

In this representation, the first operation is add^, the second is H^ A ) (the Hadamard gate on qubit A) and 
so on. 

The algebraic specification of the algorithm as products of gates does not make it easy to see why the 
algorithm works. It is also difficult to see which operations depend on each other. Such dependencies 
are used to determine whether the operations can be "parallelized". Quantum networks make these tasks 
simpler. The quantum network for the above sequence is shown in Fig. 2. 
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FIG. 2: Quantum network for solving the parity problem. A quantum network has a (horizontal in this 
case) line for each qubit. The line can be thought of as the time-line for the qubit and is shown in blue. 
Each gate is drawn as a box, circle, or other element intercepting the lines of the qubits it acts on. In 
this case, time runs from left to right. Each qubit's time-line starts at the point where it is added. In this 
example, the qubits' time-lines end when they are measured, at which point a classical bit (brown time 
line) containing the measurement outcome is introduced. The operation BB is illustrated as a black box. 
The numbers underneath the network refer to checkpoints used to explain how the network solves the 
parity problem. 

To understand how the quantum network of Fig 2 solves the parity problem, we can follow the states 
as the network is "executed" from left to right, using the indicated checkpoints. Using vector notation for 
the states, at checkpoint 1 the state is 

i*>i=(<iM°Mo)' <46) 

where we used Kronecker product notation to denote the states of A, B and C, in this order. In the next 
time step, the network involves applying Hadamard gates (Eq. 13) to A and B and a not gate (Eq. 9) to C. 
At checkpoint 2, this operation results in the state 

Next, a Hadamard gate is applied to C, so that at checkpoint 3 we have, 



"*>-=u^Mi/^rUw <**> 

The next event involves applying the black box. To understand what happens, note that the effect of the 
black box can be described as "conditional on each logical state of AB, if the parity according to 6a and 
6b is 1, then apply not to C" The current state of C is such that if not is applied, only the sign changes: 



not 



l/y/2 \ /0 1\ / l/y/2 

-1/V2) \ 1 J I -l/y/2 



Now A B is in a superposition of each of the logical states, and conditional on the logical state and the 
(hidden) parity, the sign changes. As a result, although the state of C does not change, a phase is "kicked 
back" to AB. A generalization of this effect is at the heart of A. Kitaev's version of P. Shor's quantum 
factoring algorithm (Sect. 2.10). At the next checkpoint, and after some arithmetic to check which logical 
states change sign, we can write the state as 



20 



Notice that qubits A and B are in orthogonal states for different values of 6a, 6b- It suffices to apply the 
Hadamard transform again to A and B to get 




Measurements of A and B now reveal the previously unknown 6 a and 6 B . 

As can be seen, the visual representation of a quantum network eases the tasks of following what 
happens. This is why it is used extensively for presenting basic subroutines and algorithms in quantum 
computation. A guide to the commonly used network elements is given in Fig. 3. 
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FIG. 3: Quantum network elements. 



When designing or describing complicated algorithms for quantum computers, providing everything 
in terms of quantum networks can become difficult, particularly when an important part of the algorithm 
consists of computations that are best done on a classical computer. For example, a full description of 
Shor's algorithm for factoring whole numbers (see Sect. 2.10) includes a significant amount of classical 
preprocessing, which determines choices made in the quantum algorithm, and classical postprocessing, 
which computes a factor from the measured result by a continued fraction algorithm. For such algorithms, 
one can use a programming language similar to Pascal, BASIC or C enhanced with statements to access 
quantum bits and to apply quantum operations. For algorithm design, computer scientists often use a 
semi-formal language called "pseudocode" [8]. With a simple extension called "quantum pseudocode", 
the algorithm for the parity problem can be written as follows: 

BBparity(BB) 

Input: Access to a quantum black box BB that acts on three qubits by adding a parity function of the 
first two qubits to the third. 

Output: The two bits 6a and 6b of the parity function, 
foreach i e {A, B, C} 
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r aP <- |o> 

C: Initialize three one-qubit registers r a^, i = A, B, C. The corner bracket annota- 
tion declares di as a quantum register. 

end 

r a c n «- a x r a c n 
foreach i e {A, B, C} 

r aP <- H r ai n 
end 

r a n <- BB r a n 

C: r a n refers to the three qubit register consisting of the r a i ~ 1 
foreach i e {A, B} 

r aP «- H r ai n 

6j <— meas r aj n 
end 

return 6 A , &b 
end 



Any classical programming language can be extended with statements to access and manipulate quan- 
tum registers. 

Now that we have looked at the quantum solution of the parity problem, let us consider the question 
of the least number of black-box applications required by a classical algorithm: Each classical use of the 
black box can only give us one bit of information. In particular, one use of the black box with input oaOb 
reveals only the parity of oaOb according to the hidden parameters b/\ and be. Each use of the black box 
can therefore only help us distinguish between two subsets of the four possible parities. At least two uses 
of the black box are therefore necessary. Two uses are also sufficient: To determine which of the four 
parities is involved, use the black box first with input a A a B = 10 and then with input a^as = 01. As a 
result of this argument, one can consider the parity problem as a simple example of a case in which there 
is a more efficient quantum algorithm than is possible classically. However, it is worth noting that the 
comparison is not entirely fair: A truly classical oracle answering parity questions or implementing the 
black box on the states of classical bits is useless to a quantum algorithm. To take advantage of such an 
algorithm it must be possible to use superpositions that are not implicitly collapsed. Collapse can happen 
if the oracle makes a measurement or otherwise "remembers" the question that it was asked. 

2.9 Resource Accounting 

When trying to solve a problem using quantum information processing, an important issue is to determine 
what physical resources are available and how much of each resource is needed for the solution. As 
mentioned before, in classical information, the primary resources are bits and operations. The number of 
bits used by an algorithm is called its "space" requirement. The number of operations used is called its 
"time" requirement. If parallel computation is available, one can distinguish between the total number of 
operations ("work") and the number of parallel steps ("time"). 
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When quantum information processing is used, the classical resources are still relevant for running 
the computer that controls the quantum system and performs any pre- and post-processing tasks. The 
main quantum resources are analogous to the classical ones: "quantum space" is the number of qubits 
needed, and "quantum time" the number of quantum gates. Because it turns out that reset operations have 
a thermodynamic cost, one can count irreversible quantum operations separately. This accounting of the 
resource requirements of algorithms and of the minimum resources needed to solve problems forms the 
foundations of quantum complexity theory. 

As a simple example of resource accounting, consider the algorithm for the parity problem. No clas- 
sical computation is required to decide which quantum gates to apply, or to determine the answer from 
the measurement. The quantum network consists of a total of 1 1 quantum gates (including the add's and 
meas's operations) and one oracle call (the application of the black box). In the case of oracle problems, 
one usually counts the number of oracle calls first, as we have done in discussing the algorithm. The 
network is readily parallelized to reduce the time resource to 6 steps. 

2.10 From Factoring to Phase Estimation 

The publication of Shor's quantum algorithm for efficiently factoring numbers [4, 5] was the key event that 
stimulated many theoretical and experimental investigations of quantum computation. One of the reasons 
why this algorithm is so important is that the security of widely used public key cryptographic protocols 
relies on the conjectured difficulty of factoring large numbers. An elementary overview of these protocols 
and the quantum algorithm for breaking them is in [15]. Here, we outline the relationship between fac- 
toring and the powerful technique of phase estimation. This relationship helps in understanding many of 
the existing quantum algorithms and was first explained in [16], motivated by Kitaev's version [17] of the 
factoring algorithm. 

The factoring problem requires writing a whole number N as a product of primes. (Primes are whole 
numbers greater than 1 that are divisible without remainder only by 1 and themselves.) Shor's algorithm 
solves this problem by reducing it to instances of the order-finding problem, which will be defined below. 
The reduction is based on basic number theory and involves efficient classical computation. At the core of 
Shor's algorithm is a quantum algorithm that solves the order-finding problem efficiently. In this case, an 
algorithm is considered efficient if it uses resources bounded by a polynomial in the number of digits of 
N. For more information on the requisite number theory, see any textbook on number theory [18, 19]. 

We begin by showing that factoring reduces to order finding. The first observation is that to factor 
a whole number it is sufficient to solve the factor- finding problem, whose statement is: Given a whole 
number N find a proper factor of N, if one exists. A "factor" of N is a whole number / that satisfies 
N = fg for some whole number g. The factor / is "proper" if / ^ 1 and f ^ N. For example, if 
N = 15, then 3 and 5 are its proper factors. For some numbers it is easy to find a proper factor. For 
example, you can tell that N is even from the least significant digit (in decimal or binary), in which case 2 
is a proper factor (unless N = 2, a prime). But many numbers are not so easy. As an example, you can try 
to find a proper factor of N = 149573 by hand 1 . You can complete the factorization of a whole number 
by recursively applying an algorithm for the factor-finding problem to all the proper factors found. 

i 

I0t>«EZ.E=EAS6M 
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Before we continue the reduction of factoring to order finding, we briefly explain modular arithmetic, 
which both simplifies the discussion and is necessary to avoid computing with numbers that have expo- 
nential numbers of digits. We say that a and b are "equal modulo N", written as a = b mod N, if a — b 
is divisible by N (without remainder). For example, 3 = 18 mod 15 = 33 mod 15. Equality modulo N 
is well-behaved with respect to addition and multiplication. That is, if a = b mod N and c = d mod N, 
then a + c = b + d mod N and ac = bd mod N. For factoring N, we will be looking for whole numbers 
a that are divisible by a proper factor of N. If a has this property, then so does any b with b = a mod N. 
We therefore perform all arithmetic "modulo N". One way to think about this is that we only use whole 
numbers a that satisfy < a < N — 1. We can implement an arithmetic operation modulo iV by first ap- 
plying the operation in the usual way and then computing the remainder after division by N. For example, 
to obtain ab mod N, we first compute ab. The unique c such that < c < N — 1 and c = ab mod N is the 
remainder after division of ab by N. Thus c is the result of multiplying a by b modulo N. Consistent with 
this procedure, we can think of the expression a mod N as referring to the remainder of a after division 
by N. 

The second observation in the reduction of factoring to order finding is that it is sufficient to find a 
whole number r with the property that r 2 — 1 is a multiple of N but r — 1 and r + 1 are not. Using the 
language of modular arithmetic, the property is expressed as r 2 = 1 mod N but r ^ 1 mod N and r ^ 
— 1 mod N. Because 1 mod N and —1 mod N are the obvious square roots of 1 mod N, we say that r is 
a "non-trivial square root of unity" (modulo N). For such an r, one can write r 2 — 1 = (r — l)(r + l) = mN 
for some whole number m. This implies that every prime factor p of N divides either (r — 1) or (r + 1) 
so that either (r — 1) or (r + 1) is or shares a factor with N. Suppose that r — 1 is or shares such a factor. 
Because r — 1 is not a multiple of N, the greatest common divisor of r — 1 and N is a proper factor of 
N. Since there exists an efficient classical algorithm (the "Euclidean algorithm") for finding the greatest 
common divisor, we can easily find the desired proper factor. 

The examples of N = 15 and N = 21 serve to illustrate the key features of the algorithm. For N = 15, 
possible choices for r are r = 4 (4 2 — 1 = 1* 15) and r = 11 (ll 2 — 1 = 120 = 8 * 15). For the first 
choice, the proper factors emerge immediately: 4—1 = 3,4+1 = 5. For the second, it is necessary to 
determine the greatest common divisors. Let gcd(x, y) stand for the greatest common divisor of x and y. 
The proper factors are gcd(ll — 1, 15) = gcd(10, 15) = 5 and gcd(ll + 1, 15) = gcd(12, 15) = 3. For 
N = 21, one can take r = 8, as 8 2 — 1 = 63 = 3 * 21. In this case, 8 — 1 = 7 is a proper factor and 
gcd(8 + 1, 21) = 3 is another. 

For iV even or a power of a prime it is not always possible to find a non-trivial square root of unity. 
Because both of these cases can be handled efficiently by known classical algorithms, we can exclude 
them. In every other case, such numbers r exist. One way to find such an r is to start from any whole 
number q with 1 < q < N. If gcd(g, N) = 1, then according to a basic result in number theory there 
is a smallest whole number k > 1 such that q k — 1 = mod N. The number k is called the "order" of 
q modulo N. If k is even, say k = 21, then (q 1 ) 2 = 1 mod N, so q l is a (possibly trivial) square root 
of unity. For the example of iV = 15, we can try q = 2. The order of 2 modulo 15 is 4, which gives 
r = 2 2 = 4, the first of the two choices in the previous paragraph. For N = 21, again with q = 2, the 
order is 6: 2 6 — 1 = 63 = 3 * 21. Thus, r = 2 3 = 8. We can also try q = 11, in which case with foresight it 
turns out that ll 6 — 1 is divisible by 21. A possible problem appears, namely, the powers q k that we want 
to compute are extremely large. But modular arithmetic can be used to avoid this problem. For example, 
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to find the order of 11 modulo 21 by a direct search, we can perform the following computation: 



ll 2 
ll 3 



121 
11 * ll 2 



5*21 + 16 



11 * 16 mod 21 
—55 mod 21 
11 * 8 mod 21 
11* 4 mod 21 
11 * 2 mod 21 



16 mod 21 



11 * (-5) mod 21 
3 * 21 + 8 mod 21 
4*21 +4 mod 21 



ll 4 
ll 5 
ll 6 



11 * ll 3 
11 * ll 4 
11 * ll 5 



8 mod 21 
4 mod 21 



2 mod 21 
1 mod 21 



(52) 



In general such a direct search for the order of q modulo N is very inefficient, but as we will see, there is 
an efficient quantum algorithm that can determine the order. 

A factor- finding algorithm based on the above observations is the following: 

FACTORFlND(iV) 

Input: A positive, non-prime whole number N. 

Output: A proper factor / of N, that is / is a whole number such that 1 < f < N and N = fg for some 
whole number g. 

1. If N is even, return / = 2. 

2. If N = p k for p prime, return p. 

3. Randomly pick 1 < q < N — 1. 

3. a. If / = gcd(g, N) > 1 return /. 

4. Determine the order k of q modulo N using the quantum order-finding algorithm. 

4. a. If k is not even, repeat at step 3. 

5. Write k = 21 and determine r = q l mod N with 1 < r < N. 

5. a. If 1 < / = gcd(r - 1, N) < N, return /. 
5.b. If 1 < / = gcd(r + 1,N)< N, return /. 

5.c. If we failed to find a proper factor, repeat at step 3. 

The efficiency of this algorithm depends on the probability that a randomly chosen q at step 3 results 
in finding a factor. By using an analysis of the group of numbers q that satisfy gcd(g, N) = 1, it can be 
shown that this probability is sufficiently large. 

The main problem that remains to be solved is that of finding the order of q mod N. A direct search 
for the order of q mod N involves computing the sequence 



This sequence can be conveniently visualized as a cycle whose length is the order of q mod N (Fig. 4). 



1 — > q — > q 2 mod N 



q k ~ l mod N — ► 1 = q k mod N. 



(53) 
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The cycle of q mod N 



The cycle of 8 mod 15 



q k 1 mod N 



q k 2 mod A" q' 2 mod A" 



FIG. 4: Multiplicative cycles of q mod N. Each number on a cycle is obtained from the previous one by 
multiplication by q mod N. 

To introduce the quantum algorithm, we first associate the logical quantum states |0), |1), . . . \N — 1) 
with the numbers 0, 1, . . . , N — 1. The map / which takes each number on the cycle to the next number 
along the cycle is given by f(x) = qx mod N. For q satisfying gcd(g, N) = 1, the map / permutes not 
only the numbers on the cycle, but all the numbers modulo N. As a result, the linear operator / defined 
by f\x) = \f(x)) = \qx mod N) is unitary. The quantum algorithm deduces the length of the cycle 
for q by making measurements to determine properties of the action of / on superpositions of the states 
\q s mod N). To illustrate the basic ideas, we work out the example of A^ = 15 and q = 8. The action of 
/ on the states |1), |8), |4), |2) in the cycle of 8 mod 15 is completely determined by the eigenstates and 
eigenvalues of /. For cyclicly acting permutations, a basis of eigenstates is given by the "Fourier" basis 
for the space spanned by the states in a cycle. For the cycle of interest, the Fourier basis consists of the 
states 

M = |8> + |4)+ |2>^ 

M = M|i>+;|8>-|4>-W 

l^ 2 > = H|l>- |8)+|4>- |2)| 
l^ 3 > = §(|1M|8>-|4>W 



(54) 



The phases of the /'th state of the cycle occurring in the sum for can be written as i lm . It follows that 
fl^m) = i m \ipm) > that is, the eigenvalue of / for \ip m ) is i m . Note that in the complex numbers, the powers 
of i are all the fourth roots of unity. In general, the Fourier basis for the cycle ... — > \q l mod N) —>■... 
consists of the states \il> m ) = J2i ujlm W m °d N), where cu = e %2lT / k is a primitive fc'th root of unity. (The 
complex number x is a primitive A;'th root of unity if k is the smallest whole number k > such that 
x k = 1. For example, both —1 and i are fourth roots of unity, but only i is primitive.) 
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It is, of course, possible to express the logical state 1 1) using the Fourier basis: 

\l) = l(Ho) + M + M + M)- (55) 

The key step of the quantum algorithm for order finding consists of a measurement to estimate a random 
eigenvalue of / whose associated eigenstate occurs in the expression for |1) in terms of the Fourier basis. 
If the eigenvalue found is a primitive /c'th root of unity, we infer that the cycle length is divisible by k and 
check (using a classical algorithm) whether this is the order of q. In the example, the random eigenvalues 
are 1 (the only primitive first root of unity), i and — % (primitive fourth roots of unity) and —1 (the primitive 
second root of unity). The order is found if the random eigenvalue is a primitive fourth root of unity, which 
happens with probability 1/2 in this case. 

The quantum algorithm for obtaining an eigenvalue is called the "phase estimation" algorithm. It 
exploits a more general version of the phase kick back we encountered in the solution of the parity problem. 
The phase kick back transfers the eigenvalue of an eigenstate of / to a Fourier basis on a number of 
additional qubits called "helper" or "ancilla" qubits. Which Fourier state results is then determined by 
a subroutine called the "measured quantum Fourier transform". We introduce these elements in the next 
paragraphs. Their combination for solving the general order-finding problem is shown in Fig. 10. 

Fig. 5 shows how to kick back the eigenvalue of an eigenstate of / using a network implementing the 
controlled-/ operation. 




FIG. 5: Phase estimation with one qubit. The input is a product state on one ancilla qubit and on a second 
quantum system as shown. The state \ip m ) on the second system is an eigenstate of /. For the example 
under discussion (see Eq. 54), the eigenvalue is i m . A controlled-/ operation is applied to the input, that is, 
/ is applied to the second system conditional on |i) for the ancilla qubit. In the bra-ket notation, the total 
operation can be written as |o) (o| + |i) (i|/ (system labels have been omitted). Since / changes only the 
phase of its input, the second system is unchanged, but the phase modifies the ancilla qubit's superposition 
as shown. 

The network in Fig. 5 can be used with input 1 1) on the second system. From Eq. 55 and the superposition 
principle, it follows that the output correlates the different phase kickback states with the four eigenvectors 



28 



\ipm)- That is, the network implements the following transformation: 



2V2 



(|o> + |i» 



V 



+ l^2> 

+ MJ 



/(|o>+i°|i»M,> 
+ (|o>+i 1 |i»l^i> 



2V2 



\ 



+ (|0>+i 2 |l»|^2> 



V 



+ (|o> 



1° 1 



(56) 
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The hope is that a measurement of the first qubit can distinguish between the four possible phases that can 
be kicked back. However, because the four states are not mutually orthogonal, they are not unambiguously 
distinguishable by a measurement. To solve this problem, we use a second qubit and a controlled-/ 2 as 
shown in Fig. 6. 




FIG. 6: Phase estimation with two qubits. Using two qubits ensures distinguishability of the eigenvalues 
of / for the states \ip m ). The states of the input qubits are used to represent the numbers from to 3 in 
binary. The most significant bit (the "two"'s digit in the binary representation) is carried by the top qubit. 
That is, we make the following identification: |0) = |oo), |1) = |oi), |2) = |io) and |3) = |n). It 
follows that the network has the effect of applying f m conditional on the input qubits' logical state being 
|m). 

The four possible states \u m ) that appear on the ancilla qubits in the network of Fig. 6 are the Fourier 
basis for the cycle 0— ► 1 — > 2 — > 3 — > and are therefore orthonormal. If we apply the network of Fig. 6 
with |1) instead of \i/j m ) at the lower input, the output correlates the four \i/j m ) in the superposition with 
the \u m ), which makes the information about the eigenvalues of / available in the Fourier basis of the two 
ancilla qubits. This approach has the advantage that the states are known, whereas in the Fourier basis for 
the cycle of q mod N, the states depend on the numbers in the cycle, which are not known in advance 
(except in very simple cases, such as the example we are working with). 

To learn one of the eigenvalues of /, the last step is to make a "measurement in the Fourier basis". For 
one qubit representing the binary numbers and 1, the Fourier basis is -j= (|0) + |1)) and 4= (|0) — |1)), 
which is constructed as discussed after Eq. 54, but using the square root of unity uj = — 1 instead of the 
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fourth root i. To make a measurement that determines which of the two basis vectors is present, it suffices 
to apply the Hadamard transform H and make a standard measurement, just as we did twice in the network 
of Fig. 2. A more complicated network works with two qubits representing the binary numbers from to 
3. Such a network is shown in Fig. 7. 



|^2*ai+a ) < 




FIG. 7: The measured quantum Fourier transform [20] on two qubits representing the numbers 0, 1, 2, 3. 
If the input is one of the Fourier states \ u a ), where the binary digits of a are determined by a = 2 * a\ + a , 
then the measurement outcomes are ao and a\, as shown. The numbers under the network are checkpoints 
used for analyzing the network. 



To see how the network extracts the bits in the index of \u a ), we can follow the states as the network is 
executed. The input state at checkpoint 1 in Fig. 7 is given by 



A°* a |0> \ 



I0i> 



K> = 2 



:l*a 



1) 



■2*a 



|2> 



V 



■3*a 



i3>y 



/^(0*2 1 +0*2°)(ai*2 1 +ao*2°) | 0£ ^ \ 
_|_^(0*2 1 +i*2 )(ai*2 1 +a *2 )| 01 ^ 
_|_^(l*2 1 +0*2°)(ai*2 1 +a *2°) | 10 ^ 
\^ _^^(l*2 1 +l*2°)(ai*2 1 +a *2 )| 11 ^ J 



(57) 



In the last sum, the relevant numbers have been fully expanded in terms of their binary digits to give a 
flavor of the general principles underlying the measured Fourier transform. The next step of the network 
applies a Hadamard gate to the qubit carrying the most significant digit. To understand how it succeeds 
in extracting a , the least significant bit of a, let b with binary digits b and bi represent one of the logical 
states of the two qubits. As before, the most significant bit bi is represented by the top/first qubit that the 
first Hadamard gate is applied to. The phase of \b) in Eq. 57 is given by ^C'i*2 1 +bo*2 )(a 1 *2 1 +ao*2 ) Nextj 
we determine how this phase depends on b\\ 



■(fei*2 1 +bo*2 )(ai*2 1 +a *2°) 



•6l*2 1 *(ai*2 1 +a *2°) •&o*2°*(ai*2 1 +a *2 ) 

t i 

•6i*ai*2 2 •&i*ao*2 1 •fro*2°*(ai*2 1 +ao*2°) 

Lb t 



30 



= (i 4 ) 6l * ai (2 2 ) 6l * a ° ^ b O*2 *(ai*2 1 +ao*2 ) 

_ / -j\6i*ao ^6o*2°*(ai*2 1 +ao*2°) (58) 

It follows that if a = 0, the phase does not depend on b\, and if a = 1, it changes sign with b\. This sign 
change can be detected by performing the Hadamard transform and measuring, as can be seen explicitly 
by computing the state after the Hadamard transform at checkpoint 2: 

= _L ^0*20*^*21+00*2°) | flo) | o) + .1*20*^*21+00*2°) | aQ ) 
= | aQ ^ J_ ^0*2°*(a 1 *2 1 +a *2°)| o ^ + ^l*2 () *(ai*2 1 +a *2°)| 1 ^ N J 

The phases still show a dependence on ao via the terms i b °* 2 * a °* 2 = i b ° a °. The purpose of the phase shift 
gate conditioned on the measurement outcome is to remove that dependence. The result is the following 
state on the remaining qubit at checkpoint 3: 




The final Hadamard transform followed by a measurement therefore results in the bit a±, as desired. 

The elements that we used to determine the order of 8 modulo 15 can be combined and generalized to 
determine the order of any q modulo N with gcd(g, N) = 1. The general network is shown in Fig. 10. Two 
features of the generalization are not apparent from the example. First, in order for the quantum network 
to be efficient, an efficient implementation of the controlled p l operation is required. To obtain such an 
implementation, first note that to calculate f 2 (x) = q 2 x mod N it suffices to square q repeatedly modulo 
N using (q 2 ™) mod N = q 2m+1 mod N until we obtain q 2 ' mod N. The result is then multiplied by 
x mod N. This computation is efficient. For any given q, it can be converted to an efficient network 
consisting of Toffoli and controlled-not gates acting on the binary representation of x. The conversion can 
be accomplished with standard techniques from the theory of reversible classical computation. The result 
is an efficient network for f 2 . Basic network theory can then be used to implement the controlled version 
of this operation [21]. 

The understand the second feature, note that we were lucky that the order of 8 modulo 15 was a power 
of 2, which nicely matched the measured Fourier transform we constructed on two qubits. The measured 
Fourier transform on m ancilla qubits can detect exactly only eigenvalues that are powers of the 2 m 'th 
root of unity e™! 2 ™ . The phase kicked back by the controlled operations corresponds to a /c'th root of 
unity. Given a Fourier state on the cycle of q mod N, the resulting state on the ancilla qubits has phases 
that go as powers of a fc'th root of unity. Fortunately, the ancilla's Fourier basis is such that the measured 
Fourier transform picks up primarily those basis states whose generating phase is close to the kick back 
phase. Thus we are likely to detect a nearby to = e*' 77 / 2 ™ . It is still necessary to infer (a divisor of) k 
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from knowledge of such an u. Since we know that the order k is bounded by N, the number of possible 
phases kicked back that are near the measured uj is limited. To ensure that there is only one possible such 
phase, it is necessary to choose m such that 2 m > N 2 . See also the caption of Fig. 10. 
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FIG. 9: Network for quantum order finding and phase estimation. The number m of qubits used for the phase 
kick back has to be chosen such that m > 2 * log 2 (£;„), where k u is a known upper bound on the order k 
of q mod N. Because N > k, one can set m = 2|~log 2 (iV)], where \x\ is the least whole number s > x. 
There is an eigenvalue A/ = e l 2l7T ^ k of one of the Fourier eigenvectors associated with the cycle of q mod N 
such that the number a whose binary digits are the measurement outcomes satisfies e'W 2 '" « e i2ni/k_ jy[ ore 
precisely, with probability above .405, there exists I such that \a/2 m — l/k\ < l/2 m+1 [16]. Since any two 
distinct rational numbers with denominator at most k u differ by at least \jk\ > 2/2 m+1 , the theory of rational 
approximations guarantees that we can uniquely determine the number l/k. There is an efficient classical 
algorithm based on continued fractions that computes r and s with r/s = l/k and s = k/gcd(l,k). The 
probability that gcd(Z, k) = 1 is at least l/(log 2 (fc) + 1), in which case we learn that s = k, and this is the 
order of q mod N. Note that the complexity of the network depends on the complexity of implementing the 
controlled f 2 operations. Because these operations can be implemented efficiently, the network and hence 
the determination of the order of q mod N are efficient in the sense that on average, polynomial resources in 
log 2 (iV) suffice. 
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FIG. 10: Network for quantum order finding and phase estimation. The number m of qubits used for the 
phase kick back has to be chosen such that m > 2 * \og 2 (k u ), where k u is a known upper bound on the order 
k of q mod N. Because N > k, one can set m = 2 |~log 2 (iV)~| , where \x~\ is the least whole number s > x. 
There is an eigenvalue A; = e l 2ln ' k of one of the Fourier eigenvectors associated with the cycle of q mod N 
such that the number a whose binary digits are the measurement outcomes satisfies e* 7 ™/ 2 ™ ~ ^-Ki/k^ 
More precisely, with probability above .405, there exists / such that \a/2 m — l/k\ < l/2 m+1 [16]. Since 
any two distinct rational numbers with denominator at most k u differ by at least 1/k 2 > 2/2 m+1 , the 
theory of rational approximations guarantees that we can uniquely determine the number I / k. There is 
an efficient classical algorithm based on continued fractions that computes r and s with r/s = l/k and 
s = fc/gcd(Z, k). The probability that gcd(/, k) — 1 is at least l/(log 2 (/c) + 1), in which case we learn 
that s = k, and this is the order of q mod N. Note that the complexity of the network depends on the 
complexity of implementing the controlled f 2 operations. Because these operations can be implemented 
efficiently, the network and hence the determination of the order of q mod iV are efficient in the sense that 
on average, polynomial resources in log 2 (A r ) suffice. 
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3 Advantages of Quantum Information 



The notion of quantum information as explained in this primer was established in the 1990s. It emerged 
from research focused on understanding how physics affects our capabilities to communicate and to pro- 
cess information. The recognition that usable types of information need to be physically realizable was 
repeatedly emphasized by R. Landauer who proclaimed that "information is physical" [22]. Beginning 
in the 1960s, R. Landauer studied the thermodynamic cost of irreversible operations in computation [23]. 

C. Bennett showed that by using reversible computation, this cost can be avoided [24]. Limitations of mea- 
surement in quantum mechanics were investigated early by researchers such as J. von Neumann [25, 26], 
and later by A. Holevo [27] and C. Helstrom [28]. A. Holevo introduced the idea of quantum communica- 
tion channels and found bounds on their capacity for transmitting classical information [29]. Initially, most 
work focused on determining the physical limitations placed on classical information processing. The fact 
that pairs of two-level systems can have correlations not possible for classical systems was proven by 
J. Bell [30] in 1964. Subsequently, indications that quantum mechanics offers advantages to information 
processing came from S. Wiesner's studies of cryptographic applications [1] in the late 1960s. S. Wiesner's 
work was not recognized until the 1980s, when C. Bennett, G. Brassard, S. Breidbart and S. Wiesner [2] 
introduced the idea of quantum cryptography, which can be used to communicate in secret. 

Initially, the term "quantum computation" was mostly used to refer to classical computers realized 
using quantum mechanical systems. In the 1980s, R Benioff [31], R. Feynman [3] and Y. I. Manin [32] 
introduced the idea of a quantum computer based on quantum information. They noted that the apparent 
exponential complexity of simulating quantum mechanics on a classical computer might be overcome if 
we could use a computer that is itself based on quantum mechanics. A formal model of quantum Tur- 
ing machines was soon defined by D. Deutsch [33], who later also introduced quantum networks [34]. 

D. Deutsch and R. Jozsa [35] were the first to introduce a black box problem that can be solved determin- 
istically on a quantum computer in fewer steps than on a classical computer. 

In spite of suggestions that it could lead to large efficiency improvements in simulating physics, quan- 
tum information processing was still largely an academic subject. Based on work by E. Bernstein and 
U. Vazirani [13] that formalized quantum complexity theory, D. Simon [14] showed that, for black-box 
problems, quantum computers can be exponentially more efficient than classical deterministic or proba- 
bilistic computers, giving the first indication of a strong advantage for quantum information processing. It 
was Shor's algorithm for factoring large whole numbers [4, 5] that finally convinced a larger community 
that quantum information was more than just a tool for realizing classical computers. This change in at- 
titudes was in no small part due to the fact that the security of commonly used cryptographic protocols is 
based on the hardness of factoring. 

At that point, it was still generally believed that the fragility of quantum states made it unlikely for 
reasonably large quantum computers to be realized in practice. But the discovery by Shor [36] and 
A. Steane [37] that quantum error-correction was possible soon changed that view, see [12] for an in- 
troductory overview. 

As a result of the recognition of the utility and realizability of quantum information, the science of 
quantum information processing is a rapidly growing field. As quantum information becomes increasingly 
accessible by technology, its usefulness will be more apparent. The next few sections briefly discuss what 
we currently know about applications of quantum information processing. A useful reference text on 
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quantum computation and information with historical notes is the book by M. Nielsen and I. Chuang [38]. 
3.1 Quantum Algorithms 

Shor's factoring algorithm, which precipitated much of the current work in quantum information pro- 
cessing, is based on a quantum realization of the fast Fourier transform. The most powerful version of 
this technique is now represented by the phase-estimation algorithm of A. Kitaev [17] as formalized by 
R. Cleve etal. [16]. See Sect. 2.10 for an explanation of the factoring algorithm and phase estimation. The 
best known application of quantum factoring is to cryptanalysis, where it can be used to efficiently break 
the currently used public-key cryptographic codes. Whether there are any constructive applications of 
quantum factoring and its generalizations remains to be determined. For users of public key cryptography, 
a crucial question is: "How long can public key codes based on factoring continue to be used safely?" 
To attempt to answer this question, one can note that to break a code with a typical key size of 1000 bits 
requires more than 3000 qubits and 10 8 quantum gates, which is well out of reach of current technology. 
However, it is conceivable that a recording of encrypted information transmitted in 2000 can be broken in 
the next "few" decades. 

Shor's quantum factoring algorithm was not the first with a significant advantage over classical algo- 
rithms. The first quantum algorithms to be proposed with this property were algorithms for simulating 
quantum mechanical systems. These algorithms simulate the evolution of a reasonably large number of 
interacting quantum particles, for example, the electrons and nuclei in a molecule. The algorithms' outputs 
are what would be measurable physical quantities of the system being simulated. The known methods for 
obtaining these quantities on classical computers scale exponentially with the number of particles, except 
in special cases. 

The idea of using quantum computers for simulating quantum physics spurred the work that eventually 
lead to the quantum factoring algorithm. However, that idea did not have the broad scientific impact 
that the quantum factoring algorithm had. One reason is that because of its cryptographic applications, 
factoring is a heavily studied problem in theoretical computer science and cryptography. Because so 
many people have tried to design efficient algorithms for factoring and failed, the general belief that 
factoring is hard for classical computers has a lot of credibility. In contrast, the problem of quantum 
physics simulation has no simple formulation as an algorithmic problem suitable for study in theoretical 
computer science. Furthermore, many researchers still believe that the physically relevant questions can be 
answered with efficient classical algorithms, requiring only more cleverness on the part of the algorithms 
designers. Another reason for the lack of impact is that many of the fundamental physical quantities 
of interest are not known to be efficiently accessible even on quantum computers. For example, one of 
the first questions about a physical system with a given Hamiltonian (energy observable), is: What is 
the ground state energy? It is known that the ability to efficiently answer this question for physically 
reasonable Hamiltonians leads to efficient algorithms for hard problems such as the traveling salesman or 
the scheduling problems. In spite of occasional claims to the contrary, an efficient quantum solution to 
these problems is widely considered unlikely. 

Most quantum algorithms for physics simulations are based on a direct emulation of the evolution of 
a quantum mechanical system. The focus of the original proposals by Feynman and others was on how 
to implement the emulation using a suitable formulation of general-purpose quantum computers. After 
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such computers were formalized by Deutsch, the implementation of the emulation was generalized and 
refined by S. Lloyd [39], Wiesner [40] and C. Zalka [41]. The ability to emulate the evolution of quan- 
tum systems is actually widely used by classical "Monte-Carlo" algorithms for simulating physics, where 
the states amplitudes are, in effect, represented by expectations of random variables that are computed 
during the simulation. As in the case of the quantum algorithms for physics emulation, the Monte-Carlo 
algorithms efficiently evolve the representation of the quantum system. The inefficiency of the classical 
algorithm arises only in determining a physical quantity of interest. In the case of Monte-Carlo algorithms, 
the "measurement" of a physical quantity suffers from the so-called "sign problem", often resulting in ex- 
ponentially large, random errors that can be reduced only by repeating the computation extremely many 
times. In contrast, the quantum algorithms for emulation can determine many (but not all) of the in- 
teresting physical quantities with polynomially bounded statistical errors. How to efficiently implement 
measurements of these quantities has been the topic of more recent work in this area, much of which is 
based on variants of the phase estimation algorithm [42, 43, 44, 45, 46]. 

Although several researchers have suggested that there are interesting quantum physics simulations 
that can be implemented with well below 100 qubits, one of the interesting problems in this area of 
research is to come up with a specific simulation algorithm that uses small numbers of qubits and quantum 
gates, and that computes an interesting physical quantity not easily obtainable using available classical 
computers. 

Another notable algorithm for quantum computers, unstructured quantum search, was described by 
L. Grover [6]. Given is a black box that computes a binary function / on inputs x with < x < N. 
The function / has the property that there is a unique input a for which f(a) = 1. The standard quantum 
version of this black box implements the transformation = \x)\b © f(x)), where b is a bit and 

b © f(x) is computed modulo 2. Unstructured quantum search finds a quadratically faster, that is, in 
time of order N 1 / 2 , than the best classical black-box search, which requires time of order N. The context 
for this algorithm is the famous P ^ NP conjecture, which is captured by the following algorithmic 
problem: Given is a classical circuit C that computes an output. Is there an input to the circuit for which 
the circuit's output is l? Such an input is called a "satisfying" input or "assignment". For any given input, 
it is easy to check the output, but an efficient algorithm that finds a satisfying input is conjectured to be 
impossible. This is the P ^ NP conjecture. Generalizations of Grover's search algorithm ("amplitude 
amplification" [47]) can be used to find satisfying inputs faster than the naive classical search, which 
tries each possible input in some, possibly random, order. It is worth noting, howoever, that if sufficient 
classical parallelism is available, quantum search loses many of its advantages. 

The three algorithms just described capture essentially all the known algorithmic advantages of quan- 
tum computers. Almost all algorithms that have been described are applications of phase estimation or of 
amplitude amplification. These algorithms well justify developing special purpose quantum information 
processing technology. Will general purpose quantum computers be useful? More specifically, what other 
algorithmic advantages do quantum computers have? 

3.2 Quantum Communication 

Quantum communication is an area in which quantum information has proven (rather than conjectured) 
advantages. The best known application is quantum cryptography, whereby two parties, Alice and Bob, 
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can generate a secret key using a quantum communication channel (for example, photons transmitted 
in optical fiber) and an authenticated classical channel (for example, a telephone line). Any attempt at 
learning the key by eavesdropping is detected. A quantum protocol for generating a secret key is called 
a "quantum key exchange" protocol. There are no equally secure means for generating a secret key by 
using only classical deterministic channels. Few quantum operations are needed to implement quantum 
key exchange, and as a result there are working prototype systems [48, 49, 50]. To overcome the distance 
limitations (tens of kilometers) of current technology requires the use of quantum error-correction and 
hence more demanding quantum technology. 

Quantum key exchange is one of an increasing number of multi-party problems that can be solved more 
efficiently with quantum information. The area of research concerned with how several parties at different 
locations can solve problems while minimizing communication resources is called "communication com- 
plexity". For quantum communication complexity (R. Cleve and H. Burhman [51]), the communication 
resources include either shared entangled qubits or a means for transmitting quantum bits. A seminal paper 
by Burhman, Cleve and W. Van Dam [52] shows how the non-classical correlations present in maximally 
entangled states lead to protocols based on such states that are more efficient than any classical determin- 
istic or probabilistic protocol achieving the same goal. R. Raz [53] showed that there is an exponential 
improvement in communication resources for a problem in which Alice and Bob have to answer a question 
about the relationship between a vector known to Alice and a matrix known to Bob. Although this problem 
is artificial, it suggests that there are potentially useful advantages to be gained from quantum information 
in this setting. 

3.3 Quantum Control 

According to G. Moore's law of semiconductor technology, the size of transistors is decreasing exponen- 
tially, by a factor of about .8 every year. If this trend continues, then over the next few decades devices 
will inevitably be built whose behavior will be primarily quantum mechanical. For the purpose of classical 
computation, the goal is to remove the quantum behavior and stabilize classical information. But quantum 
information offers an alternative: It is possible to directly use the quantum effects to advantage. Whether 
or not this advantage is useful (and we believe it is), the ideas of quantum information can be used to 
systematically understand and control quantum mechanical systems. 

The decreasing size of semiconductor components is a strong motivation to strive for better under- 
standing the behavior of condensed matter quantum mechanical systems. But there is no reason to wait for 
Moore's law: There are a rapidly increasing number of experimental systems in which quantum mechan- 
ical effects are being used and investigated. Examples include many optical devices (lasers, microwave 
cavities, entangled photon pairs), nuclear magnetic resonance with molecules or in solid state, trapped ion 
or atom systems, Rydberg atoms, superconducting devices (Josephson junctions, SQUIDs) and spintronics 
(electron spins in semiconductor devices). Many of these systems are being considered as candidates for 
realizing quantum information processing. Yet, regardless of the future of quantum information process- 
ing, there is ample motivation for studying these systems. 
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3.4 Outlook 



The science of quantum information processing is promising to have a significant impact on how we 
process information, solve algorithmic problems, engineer nano-scale devices and model fundamental 
physics. It is already changing the way we understand and control matter at the atomic scale, making 
the quantum world more familiar, accessible and understandable. Whether or not we do most of our 
everyday computations by using the classical model, it is likely that the physical devices that support these 
computations will exploit quantum mechanics and integrate the ideas and tools that have been developed 
for quantum information processing. 
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4 Glossary 



Algorithm. A set of instructions to be executed by a computing device. What instructions are available 
depends on the computing device. Typically, instructions include commands for manipulating the 
contents of memory and means for repeating blocks of instructions indefinitely or until a desired 
condition is met. 

Amplitude. A quantum system with a chosen orfhonormal basis of "logical" states \i) can be in any 
superposition ^jQ!j|i) of these states, where ^ \cei\ 2 = 1. In such a superposition, the complex 
numbers aij are called the amplitudes. Note that the amplitudes depend on the chosen basis. 

Ancillas. Helper systems used to assist in a computation involving other information systems. 

Bell basis. For two qubits A and B, the Bell basis consists of the four states ^= (|oo) ab ± |n) AB ) and 

Bell states. The members of the Bell basis. 

Bit. The basic unit of deterministic information. It is a system that can be in one of two possible states, o 
and 1. 

Bit sequence. A way of combining bits into a larger system whose constituent bits are in a specific order. 
Bit string. A sequence of o's and l's that represents a state of a bit sequence. Bit strings are the words of 
a binary alphabet. 

Black box. A computational operation whose implementation is unknown. Typically, a black box imple- 
ments one of a restricted set of operations, and the goal is to determine which of these operations 
it implements by using it with different inputs. Each use of the black box is called a "query". The 
smallest number of queries required to determine the operation is called the "query complexity" of 
the restricted set. Determining the query complexity of sets of operations is an important problem 
area of computational complexity. 

Bloch sphere. The set of pure states of a qubit represented as points on the surface of the unit sphere in 
three dimensions. 

Bra. A state expression of the form which is considered to be the conjugate transpose of the ket 
expression |^). 

Bra-ket notation. A way of denoting states and operators of quantum systems with kets (for example, 
\ip)) and bras (for example, 

Circuit. A combination of gates to be applied to information units in a prescribed order. To draw circuits, 
one often uses a convention for connecting and depicting gates. See also "network". 

Circuit complexity. The circuit complexity of an operation on a fixed number of information units is the 
smallest number of gates required to implement the operation. 

Classical information. The type of information based on bits and bit strings and more generally on words 
formed from finite alphabets. This is the information used for communication between people. Clas- 
sical information can refer to deterministic or probabilistic information, depending on the context. 

Computation. The execution of the instructions provided by an algorithm. 

Computational states. See the entry for "logical states". 

Computer. A device that processes information. 

Density matrix or operator. A representation of pure and mixed states without redundancy. For a pure 
state |^), the corresponding density operator is \vfj) {vp\. A general density operator is a probabilistic 
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combination ^ with £) 4 A; = 1. 

Deterministic information. The type of information that is based on bits and bit strings. Deterministic 
information is classical, but it explicitly excludes probabilistic information. 

Distinguishable states. In quantum mechanics, two states are considered distinguishable if they are or- 
thogonal. In this case, a measurement exists that is guaranteed to determine which of the two states 
a system is in. 

Efficient computation. A computation is efficient if it requires at most polynomially many resources as 
a function of input size. For example, if the computation returns the value f(x) on input x, where 
x is a bit string, then it is efficient if there exists a power k such that the number of computational 
steps used to obtain f(x) is bounded by \x\ k , where |x| is the length (number of bits) of x. 

Entanglement. A non-classical correlation between two quantum systems most strongly exhibited by the 
maximally entangled states such as the Bell states for two qubits, and considered to be absent in 
mixtures of product states (which are called "separable" states). Often states that are not separable 
are considered to be entangled. However, nearly separable states do not exhibit all the features of 
maximally entangled states. As a result, studies of different types of entanglement are an important 
component of quantum information theory. 

Gate. An operation applied to information for the purpose of information processing. 

Global phase. Two quantum states are indistinguishable if they differ only by a global phase. That is, 
\ip) and e l ^\ip) are in essence the same state. The global phase difference is the factor e*^. The 
equivalence of the two states is apparent from the fact that their density matrices are the same. 

Hilbert space. An n-dimensional Hilbert space consists of all complex n-dimensional vectors. A defining 
operation in a Hilbert space is the inner product. If the vectors are thought of as column vectors, 
then the inner product (x, y) of x and y is obtained by forming the conjugate transpose x^ of x and 
calculating (x, y) = x^y. The inner product induces the usual squared norm \x\ 2 = (x,x). 

Information. Something that can be recorded, communicated, and computed with. Information is fun- 
gible; that is, its meaning can be identified regardless of the particulars of the physical realization. 
Thus, information in one realization (such as ink on a sheet of paper) can be easily transferred to 
another (for example, spoken words). Types of information include deterministic, probabilistic and 
quantum information. Each type is characterized by "information units", which are abstract systems 
whose states represent the simplest information of each type. The information units define the "nat- 
ural" representation of the information. For deterministic information the information unit is the 
bit, whose states are symbolized by o and 1. Information units can be put together to form larger 
systems and can be processed with basic operations acting on a small number of them at a time. 

Inner product. The defining operation of a Hilbert space. In a finite dimensional Hilbert space with a 
chosen orfhonormal basis {e^ : 1 < i < n}, the inner product of two vectors x = Yli x i e i an ^ 
y = YliiVi e i i s gi ven by Yli^iUi- I n the standard column representation of the two vectors, this 
is the number obtained by computing the product of the conjugate transpose of x with y. For real 
vectors, this agrees with the usual "dot" product. The inner product of x and y is often written in the 
form (x, y). Pure quantum states are unit vectors in a Hilbert space. If \cp) and \ip) are two quantum 
states expressed in the ket-bra notation, there inner product is given by (|0))* |^) = 

Ket. A state expression of the form \ip) representing a quantum state. Usually \ip) is thought of as a 
superposition of members of a logical state basis \i). One way to think about the notation is to 
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consider the two symbols "|" and ")" as delimiters denoting a quantum system and ip as a symbol 
representing a state in a standard Hilbert space. The combination \ip) is the state of the quantum 
system associated with ip in the standard Hilbert space via a fixed isomorphism. In other words, one 
can think of ip <-> |^) as an identification of the quantum system's state space with the standard 
Hilbert space. 

Linear extension of an operator. The unique linear operator that implements a map defined on a basis. 

Typically, we define an operator f/ona quantum system only on the logical states U : \i) > \4>i)- 

The linear extension is defined by U - ai\i)) = ^ a^i). 
Logical states. For quantum systems used in information processing, the logical states are a fixed or- 

thonormal basis of pure states. By convention, the logical basis for qubits consists of |o) and |i). 

For larger dimensional quantum systems, the logical basis is often indexed by the whole numbers, 

|0), |1), |2), . . .. The logical basis is often also called the "computational" basis, or sometimes, the 

"classical" basis. 

Measurement. The process used to extract classical information from a quantum system. A general 
projective measurement is defined by a set of projectors p satisfying £\ Pi — i and PjP, = ^p. 
Given the quantum state [tp), the outcome of a measurement with the set {P}i is one of the classical 
indeces i associated with a projector p. The index i is the measurement outcome. The probability 
of outcome i is pi = \Pi\ipi) | 2 , and given outcome i, the quantum state "collapses" to Pi\4>i) / ^/pl. 

Mixture. A probabilistic combination of pure states of a quantum system. Mixtures can be represented 
without redundancy with density operators. Thus a mixture is of the form £\ Aj 1^0(^1' w ^ m 
K > 0, J2i A» = 1 being the probabilities of the states \ipi). This expression for mixtures defines the 
set of density operators, which can also be characterized as the set of operators p satisfying tr(p) = 1 
and for all \ip), (ip\p\ip) > ("positive semidefinite operator"). 

Network. In the context of information processing, a network is a sequence of gates applied to specified 
information units. We visualize networks by drawing horizontal lines to denote the time line of an 
information unit. The gates are represented by graphical elements that intercept the lines at specific 
points. A realization of the network requires applying the gates to the information units in the 
specified order (left to right). 

Operator. A function that transforms the states of a system. Operators may be restricted depending on 
the system's properties. For example, in talking about operators acting on quantum systems, one 
always assumes that they are linear. 

Oracle. An information processing operation that can be applied. A use of the oracle is called a "query". 
In the oracle model of computation, a standard model is extended to include the ability to query 
an oracle. Each oracle query is assumed to take one time unit. Queries can reduce the resources 
required for solving problems. Usually, the oracle implements a function or solves a problem not 
efficiently implementable by the model without the oracle. Oracle models are used to compare the 
power of two models of computation when the oracle can be defined for both models. For example, 
in 1994, D. Simon showed that quantum computers with a specific oracle O could efficiently solve 
a problem that had no efficient solution on classical computers with access to the classical version 
of O. At the time, this result was considered to be the strongest evidence for an exponential gap in 
power between classical and quantum computers. 

Overlap. The inner product between two quantum states. 
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Pauli operators. The Hermitian matrices a X} a yi a z acting on qubits, which are two-level quantum sys- 
tems. They are defined in Eq. 12. It is often convenient to consider the identity operator to be 
included in the set of Pauli operators. 

Polynomial resources. To say that an algorithm computing the function f(x), where x is a bit string, 
uses polynomial resources (in orther words, "is efficient") means that the number of steps required 
to compute f(x) is bounded by \x\ k for some fixed k. Here \x\ denotes the length of the bit string x. 

Probabilistic bit. The basic unit of probabilistic information. It is a system whose state space consists of 
all probability distributions over the two states of a bit. The states can be thought of as describing 
the outcome of a biased coin flip before the coin is flipped. 

Probabilistic information. The type of information obtained by extending the state spaces of determin- 
istic information to include arbitrary probability distributions over the deterministic states. This is 
the main type of classical information to which quantum information is compared. 

Probability amplitude. The squared norm of an amplitude with respect to a chosen orthonormal basis 
{\i)}. Thus, the probability amplitude is the probability with which the state \i) is measured in a 
complete measurement that uses this basis. 

Product state. For two quantum systems A and B, product states are of the form \ip) |0) b . Most states are 
not of this form. 

Program. An algorithm expressed in a language that can be understood by a particular type of computer. 
Projection operator. A linear operator P on a Hilbert space that satisfies P 2 = P^P = P. The projection 
onto a subspace V with orthogonal complement W is defined as follows: If x E V and y £ W, then 

P(x + y) = x. 

Pseudo-code. An semi-formal computer language that is intended to be executed by a standard "ran- 
dom access machine", which is a machine model with a central processing unit and access to a 
numerically indexed unbounded memory. This machine model is representative of the typical one- 
processor computer. Pseudo-code is similar to programming languages such as BASIC, Pascal, or 
C, but does not have specialized instructions for human interfaces, file management, or other "ex- 
ternal" devices. Its main use is to describe algorithms and enable machine-independent analysis of 
the algorithms' resource usage. 

Pure state. A state of a quantum system that corresponds to a unit vector in the Hilbert space used to 
represent the system's state space. In the ket notation, pure states are written in the form \ip) = 
^jttilO' where the \i) form a logical basis and £\ |ctj| 2 = 1. 

Quantum information. The type of information obtained when the state space of deterministic informa- 
tion is extended by normalized superpositions of deterministic states. Formally, each deterministic 
state is identified with one of an orthonormal basis vector in a Hilbert space and normalized su- 
perpositions are unit-length vectors that are expressible as complex linear sums of the chosen basis 
vectors. It is convenient to extend this state space further by permitting probability distributions over 
the quantum states (see the entry for "mixtures"). This extension is still called quantum information. 

Qubit. The basic unit of quantum information. It is the quantum extension of the deterministic bit, which 
implies that its state space consists of the unit-length vectors in a two dimensional Hilbert space. 

Read-out. A method for obtaining human-readable information from the state of a computer. For quan- 
tum computers, read-out refers to a measurement process used to obtain classical information about 
a quantum system. 
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Reversible gate. A gate whose action can be undone by a sequence of gates. 
Separable state. A mixture of product states. 

States. The set of states for a system characterizes the system's behavior and possible configurations. 

Subspace. For a Hilbert space, a subspace is a linearly closed subset of the vector space. The term can be 
used more generally for a system Q of any information type: A subspace of Q or, more specifically, 
of the state space of Q is a subset of the state space that preserves the properties of the information 
type represented by Q. 

Superposition principle. One of the defining postulates of quantum mechanics according to which if 
states |1), |2), . . . are distinguishable then wmi 12i \ a i\ 2 = 1 is a valid quantum state. 

Such a linear combination is called a normalized superposition of the states \i). 

System. An entity that can be in any of a specified number of states. An example is a desktop computer 
whose states are determined by the contents of its various memories and disks. Another example 
is a qubit, which can be thought of as a particle whose state space is identified with complex, two- 
dimensional, length-one vectors. Here, a system is always associated with a type of information that 
determines the properties of the state space. For example, for quantum information the state space 
is a Hilbert space. For deterministic information, it is a finite set called an alphabet. 

Unitary operator. A linear operator U on a Hilbert space that preserves the inner product. That is, 
(Ux, Uy) = (x, y).\fU is given in matrix form, then this expression is equivalent to WU = I. 

Universal set of gates. A set of gates that satisfies the requirement that every allowed operation on infor- 
mation units can be implemented by a network of these gates. For quantum information, it means a 
set of gates that can be used to implement every unitary operator. More generally, a set of gates is 
considered universal if for every operator U, there are implementable operators V arbitrarily close 
to U. 
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